9 November 2018

Interactive graphics with plotly

  • Impressive

  • Informative

  • Easy

From ggplot2 to plotly

ggplot2

  • More features and flexibility
  • Cannot do interactive graphics

plotly

  • Interactive plots, great for websites and dashboards
  • Also useful for exploring your data

Example of an interactive plot

It's this easy

Tidy data

## # A tibble: 6 x 6
##   country     continent  year lifeExp      pop gdpPercap
##   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
## 1 Afghanistan Asia       1952    28.8  8425333      779.
## 2 Afghanistan Asia       1957    30.3  9240934      821.
## 3 Afghanistan Asia       1962    32.0 10267083      853.
## 4 Afghanistan Asia       1967    34.0 11537966      836.
## 5 Afghanistan Asia       1972    36.1 13079460      740.
## 6 Afghanistan Asia       1977    38.4 14880372      786.

Code

library(plotly)     # also provides the %>% pipe
library(gapminder)  # provides the gapminder data set

gapminder %>%
  plot_ly(x = ~gdpPercap, y = ~lifeExp, size = ~pop, color = ~continent,
          frame = ~year, text = ~country, hoverinfo = "text",
          type = 'scatter', mode = 'markers') %>%
  layout(xaxis = list(type = "log"))

Two approaches

  1. ggplotly function
  • Takes a ggplot object and returns a plotly object
  • Doesn't pick up all customisation in ggplot, e.g. legend position
  • Easy way to add interactivity to ggplot
  2. plot_ly interface
  • Faster (computationally) than ggplotly
  • Fewer features than ggplot

ggplotly

  • With ggplotly you can take a static ggplot and make it interactive

ggplotly(Plot1_gg)
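
Plot1_gg is not defined on these slides. As a minimal sketch of the same workflow, assuming the ggplot2 and plotly packages are installed and using the built-in mtcars data as a stand-in for the original plot (the object names here are illustrative only):

```r
library(ggplot2)
library(plotly)

# Build an ordinary static ggplot (mtcars stands in for the original data)
static_plot <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

# Convert the ggplot object to an interactive plotly object
interactive_plot <- ggplotly(static_plot)
interactive_plot
```

Hovering over a point in the result should show the mapped values, and the legend entries can be clicked to hide or show each group.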

Plotly basics

  • ggplot syntax: ggplot(data,mapping=aes()) + geom_bar() ...

  • plot_ly syntax: Borrows semantics from dplyr and tidyr packages

  • Attributes defined in plot_ly() are 'global': they are carried over to subsequent traces unless a trace overrides them

  • Example:

plot_ly(economics, x = ~date, color = I("black")) %>%
 add_lines(y = ~uempmed) %>%
 add_lines(y = ~psavert, color = I("red"))

Exercises

Using the APRA dataset

  1. Using a box plot, identify the outlying funds by their rate of return. Colour by fund type.
  • hint: use type = "box"
  2. Using a scatter plot, identify the outliers in terms of operating expense ratio and investment expense ratio.
  • hint: use type = "scatter"

plot_ly(data = data.frame(), ..., color, text, type)

  • Remember the ~ before column names, e.g. x = ~`variable name` (use backticks when the name contains spaces)
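
The ~ tells plot_ly to evaluate the name as a column of the data. A short sketch of the idea, using the built-in mtcars data rather than the APRA data, with a column renamed purely to illustrate the backtick case:

```r
library(plotly)

# ~wt and ~mpg are looked up as columns of mtcars
p_mapped <- plot_ly(data = mtcars, x = ~wt, y = ~mpg,
                    type = "scatter", mode = "markers")

# Column names containing spaces need backticks inside the formula
# (the renamed column is an assumption made for this illustration)
cars2 <- mtcars
names(cars2)[names(cars2) == "wt"] <- "Weight (1000 lbs)"
p_backtick <- plot_ly(data = cars2, x = ~`Weight (1000 lbs)`, y = ~mpg,
                      type = "scatter", mode = "markers")
p_backtick
```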

Exercise answer 1: box plot

plot_ly(data=apra, x = ~`Ten-year rate of return`, 
        color = ~`Fund type`, text = ~`Fund name`, type = "box")

Exercise answer 2: scatter plot

plot_ly(data=apra, x = ~`Operating expense ratio`, 
        y = ~`Investment expenses ratio`, type = 'scatter', 
        mode = 'markers',text = ~`Fund name`, color = ~`Fund type`)

subplot

  • The subplot function provides a flexible interface for merging multiple plotly objects into a single object

  • A bit like facet_wrap and facet_grid from ggplot, e.g.

p1 <- apra %>% 
  plot_ly(x = ~`Investment expenses ratio`, y = ~`Ten-year rate of return`, 
          type = 'scatter', mode = 'markers',text = ~`Fund name`, 
          color = ~`Fund type`,size = ~`Total assets`^2,
          legendgroup= ~`Fund type`,showlegend = F)

p2 <- apra %>% 
  plot_ly(x = ~`Operating expense ratio`, y = ~`Ten-year rate of return`, 
          type = 'scatter', mode = 'markers',text = ~`Fund name`, 
          color = ~`Fund type`,size = ~`Total assets`^2,
          legendgroup= ~`Fund type`,showlegend = F)
 
p3 <- apra %>% 
  plot_ly(x = ~`Proportion of total assets in default or MySuper strategy`, 
          y = ~`Ten-year rate of return`, type = 'scatter', 
          mode = 'markers',text = ~`Fund name`, color = ~`Fund type`,
          size = ~`Total assets`^2,
          legendgroup= ~`Fund type`,showlegend = T)

apra.range<- apra %>% 
  select(`Operating expense ratio`,`Ten-year rate of return`) %>% 
  na.omit()

min.y <- floor(round(min(apra.range$`Ten-year rate of return`),1))
max.y <- ceiling(round(max(apra.range$`Ten-year rate of return`),1))

subplot(p1,p2,p3,titleY = TRUE,titleX=TRUE,shareY = TRUE) %>% 
  layout(yaxis = list(range = c(min.y,max.y)))
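
For readers without the APRA data, the same pattern can be tried on a built-in data set. This is only a sketch, assuming mtcars as a stand-in and illustrative object names:

```r
library(plotly)

# Two scatter plots against the same response (mtcars is a stand-in data set)
s1 <- plot_ly(mtcars, x = ~wt, y = ~mpg, type = "scatter", mode = "markers",
              name = "Weight")
s2 <- plot_ly(mtcars, x = ~hp, y = ~mpg, type = "scatter", mode = "markers",
              name = "Horsepower")

# Place them side by side, keep the axis titles and share the y-axis
combined <- subplot(s1, s2, titleX = TRUE, titleY = TRUE, shareY = TRUE)
combined
```

With shareY = TRUE the panels use a common y-axis, so zooming vertically in one panel should zoom the other as well.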

Drop down menus
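
No code for this topic survives on the slides. As a hedged sketch of one way to add a drop-down menu, assuming plotly's layout(updatemenus = ...) attribute and the built-in mtcars data rather than the presenter's own example:

```r
library(plotly)

# Start with a scatter plot; the menu restyles this single trace
p_menu <- plot_ly(mtcars, x = ~wt, y = ~mpg,
                  type = "scatter", mode = "markers") %>%
  layout(
    updatemenus = list(
      list(
        type = "dropdown",
        buttons = list(
          # each button calls restyle with an attribute name and a new value
          list(method = "restyle",
               args = list("marker.color", "blue"),
               label = "Blue"),
          list(method = "restyle",
               args = list("marker.color", "red"),
               label = "Red")
        )
      )
    )
  )
p_menu
```

Each button names an update method and the arguments to pass to it; here both buttons only change the marker colour, but the same structure can switch trace types or toggle visibility.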

Error bars

Calculate regression coefficients and standard errors

m <- lm(data = apra,`Ten-year rate of return` ~ `Operating expense ratio`+ 
          `Investment expenses ratio` + 
          `Proportion of total assets in default or MySuper strategy`)
d <- broom::tidy(m) %>% arrange(desc(estimate))

Create a scatter plot of the coefficients and use error_x to add the error bars

plot_ly(d, x = ~estimate, y = ~term) %>%
  add_markers(error_x = ~list(value = std.error)) %>%
  layout(yaxis = list(title ="", autorange = "reversed")) 
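
The same idea can be sketched without the APRA data or broom. The version below is an assumption-laden variant: it uses mtcars, builds the coefficient table from summary(), and passes the standard errors through the array entry of error_x, which gives one error-bar length per point in data units:

```r
library(plotly)

# Fit a small regression on a built-in data set
fit_coef <- lm(mpg ~ wt + hp, data = mtcars)

# Collect estimates and standard errors without extra packages
coefs <- as.data.frame(summary(fit_coef)$coefficients)
coef_df <- data.frame(term = rownames(coefs),
                      estimate = coefs[["Estimate"]],
                      std.error = coefs[["Std. Error"]])

# One horizontal error bar per coefficient, one standard error each side
p_err <- plot_ly(coef_df, x = ~estimate, y = ~term) %>%
  add_markers(error_x = ~list(array = std.error)) %>%
  layout(yaxis = list(title = ""))
p_err
```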

Ribbons

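# Fit a simple regression and get fitted values with their standard errors
# (note: broom >= 0.7.0 only returns .se.fit when se_fit = TRUE is passed)
ribs <- lm(`Ten-year rate of return` ~ `Operating expense ratio`, 
           data = apra) %>% broom::augment()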

apra$.rownames <- rownames(apra)
apra$.se.fit <- NULL
apra$.fitted <- NULL
apra <- left_join(apra,ribs %>% select(.rownames,.se.fit,.fitted),by=".rownames")

plot_ly(apra, x = ~`Operating expense ratio`, y = ~`Ten-year rate of return`, 
        type = 'scatter', mode = 'markers',text = ~`Fund name`) %>% 
  add_ribbons(
    ymin = ~.fitted - 1.96 * .se.fit,
    ymax = ~.fitted + 1.96 * .se.fit,
    line = list(color = 'rgba(7, 164, 181, 0.05)'),
    fillcolor = 'rgba(7, 164, 181, 0.2)',
    name = "Standard Error") %>% 
  layout(
    xaxis = list(range = c(0, 1)),
    yaxis = list(range = c(-5, 5)),
    showlegend = FALSE)
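
A self-contained variant of the ribbon plot, offered as a sketch only: it assumes the built-in mtcars data and replaces the broom and join steps with base R's predict(..., se.fit = TRUE) on a sorted grid of x values.

```r
library(plotly)

fit_rib <- lm(mpg ~ wt, data = mtcars)

# Predict on a sorted grid so the ribbon is drawn left to right
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50))
pred <- predict(fit_rib, newdata = grid, se.fit = TRUE)
grid$fitted <- pred$fit
grid$se <- pred$se.fit

# Scatter of the raw data plus a band of fitted value +/- 1.96 standard errors
p_rib <- plot_ly(mtcars, x = ~wt, y = ~mpg,
                 type = "scatter", mode = "markers") %>%
  add_ribbons(data = grid, x = ~wt,
              ymin = ~fitted - 1.96 * se,
              ymax = ~fitted + 1.96 * se,
              inherit = FALSE,
              line = list(color = "rgba(7, 164, 181, 0.05)"),
              fillcolor = "rgba(7, 164, 181, 0.2)",
              name = "95% interval") %>%
  layout(showlegend = FALSE)
p_rib
```

Setting inherit = FALSE keeps the ribbon from picking up the y mapping of the scatter trace, since the ribbon is drawn from the separate grid data frame.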

Conclusion

  • It is very easy to create impressive and highly informative interactive graphics with plotly
  • It gets a little more complicated with some features (e.g. subplot, drop-down menus, ribbons)
  • In some cases, ggplotly might be the better way to go (i.e. construct your graph in ggplot then convert to plotly)

Resources